Modelling sub-phone insertions and deletions in continuous speech recognition
Abstract
Recently, an extension to standard hidden Markov models (HMMs) for speech recognition called Hidden Model Sequence (HMS) modelling was introduced. In this approach, the relationship between the phones used in a pronunciation dictionary and the HMMs used to model them in context is assumed to be stochastic. One important feature of the HMS framework is its ability to handle arbitrary model-to-phone sequence alignments. In this paper we try to exploit that capability by using two different methods to model sub-phone insertions and deletions. Experiments on the Resource Management (RM) corpus and a subset of the Switchboard corpus show that, compared with a standard HMM baseline, word error rate (WER) reductions of 24.3% relative on RM and 2.4% absolute on Switchboard can be obtained.
Similar resources
Pronunciation Variation Modelling in a Model of Human Word Recognition
Due to pronunciation variation, many insertions and deletions of phones occur in spontaneous speech. The psycholinguistic model of human speech recognition Shortlist is not well able to deal with phone insertions and deletions and is therefore not well suited for dealing with real-life input. The research presented in this paper explains how Shortlist can benefit from pronunciation variation mo...
Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition
Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Hidden Model Sequence Models for Automatic Speech Recognition
Most modern automatic speech recognition systems make use of acoustic models based on hidden Markov models. To obtain reasonable recognition performance within a large vocabulary framework, the acoustic models usually include a pronunciation model, together with complex parameter tying schemes. In many cases the pronunciation model operates on a phoneme level and is derived independently of the...
Phonetic Modelling in the Philips Chinese Continuous Speech Recognition System
We have extended the Philips large vocabulary continuous speech recognition system towards Chinese. On the way from our existing Western language technology to Mandarin, the first step was to build a suitable phonetic model. This paper describes the development of our phonetic model, excluding tones, for Mandarin Chinese. We will present a systematic comparison of three forms of sub-syllabic units for ...
Publication date: 2000